Abstract: A problem of distributed state estimation at multiple
agents that are physically connected and have competitive
interests is mapped to a distributed source coding problem with
additional privacy constraints. The agents interact to estimate
their own states to a desired fidelity from their (sensor)
measurements, which are functions of both the local state and the
states at the other agents. For a Gaussian state and measurement
model, it is shown that the sum-rate achieved by a distributed
protocol in which the agents broadcast to one another is a lower
bound on that of a centralized protocol in which the agents
broadcast as if to a virtual CEO, with the two converging only in
the limit of a large number of agents. The sufficiency of encoding
using local measurements is also proved for both protocols.

I. Introduction 

We consider a network of K distributed agents in which 
each agent observes sensor measurements from a distinct 
part of a large interconnected physical network. Examples of 
such networks include cyber-physical systems, specifically the 
smart grid, in which an agent can be viewed as a regional 
operator whose power measurements are affected by those at 
other agents due to the physical grid connectivity. Agent k is 
interested in estimating the state (defined as a set of system 
parameters, e.g., voltages and phases in the electric grid)
of its local network from its measurements, $Y_k$, which are a
function of both the local state $X_k$ and the states $X_l$, $l \neq k$,
$l, k \in \{1, 2, \ldots, K\}$, of other agents in the network, where the
states $X_k$ are assumed to be independent of each other.

Estimating Xk at agent k with high fidelity requires the 
agents to interact and share data amongst themselves. While 
the estimate fidelity is crucial to the control decisions made 
by the agents, in many distributed systems, for competitive 
reasons, the agents wish to keep their state information private. 
This leads to a problem of competitive privacy which captures 
the tradeoff between the utility to the agent (estimate fidelity) 
that can be achieved via cooperation and the resulting privacy 
leakage (quantified via mutual information). 

Mapping utility to distortion and privacy to leakage quanti- 
fied via mutual information, one can abstract the competitive 
privacy problem as a distributed source coding problem with 
additional leakage constraints. The set of all achievable rate- 
fidelity-leakage tuples determines the utility-privacy tradeoff 
region. In [1], we introduced and studied this problem for a 
two-agent interactive system with Gaussian states and noisy 
Gaussian measurements. We proved that side-information 
(measurements at the other agent) aware Wyner-Ziv encoding
[2] at each agent achieves both the minimal rate and the 
minimal leakage for every choice of fidelity (quantified via 
mean-squared distortion). 

Even without additional privacy constraints, the problem of 
determining the set of all rate-distortion tuples in a multi-agent
network is related to the distributed source coding problem 
[3], [4] which remains open. Furthermore, for a relatively 
simpler setting obtained by assuming that a central entity, 
often referred to as a chief executive officer (CEO), wishes 
to estimate the states Xk, for all k, from the transmissions of 
all agents, we obtain a multi-variate (vector) Gaussian CEO 
problem which also remains open except for specific cases [5]. 

Circumventing these challenges, we focus on the rate- 
distortion-leakage behavior in the limit of large K for a 
distributed protocol in which each agent encodes its measure- 
ments taking into account the prior broadcasts of the other 
agents (henceforth referred to as progressive encoding) as well 
as the side-information at the other agents. We compare the 
performance of this protocol with a centralized protocol in 
which the agents broadcast their encoded messages as if to 
a virtual CEO. We consider a noisy Gaussian measurement 
model at each agent with the same level of interference from 
the states of the other agents. For this symmetric model, our
results demonstrate that the sum-rate achieved by the distributed
protocol is lower than that of the centralized protocol, with the
two converging asymptotically in K. We also prove the sufficiency
of encoding local measurements for both protocols and present
outer bounds for the per-user rate and leakage.

The paper is organized as follows. We introduce the model
and communication protocols in Section II. In Section III, we
develop the achievable rate-distortion-leakage tuples for both
protocols as well as outer bounds. We conclude in Section IV.

II. Preliminaries 

A. Model and Metrics 

We consider a network of K agents such that, at any time
instant $i$, $i = 1, 2, \ldots, n$, the measurement $Y_{k,i}$ at agent $k$,
$k = 1, 2, \ldots, K$, is related to the states $X_{m,i}$, $m = 1, 2, \ldots, K$,
at the agents as follows:

$$Y_{k,i} = X_{k,i} + \sum_{l=1,\, l \neq k}^{K} \sqrt{h}\, X_{l,i} + Z_{k,i}, \quad k = 1, 2, \ldots, K, \tag{1}$$

where the state variables $X_{m,i} \sim \mathcal{N}(0, \sigma_X^2)$, for all $m$ and
$i$, are assumed to be independent and identically distributed
(i.i.d.) and are also independent of the i.i.d. noise variables
$Z_{k,i} \sim \mathcal{N}(0, 1)$. The coefficient $h > 0$ is assumed to be fixed
for all time and known at all agents. We assume that the $k$-th
agent observes a sequence of $n$ measurements
$Y_k^n = [Y_{k,1}\ Y_{k,2}\ \ldots\ Y_{k,n}]$, for all $k$, prior to communications.
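The measurement model in (1) can be sketched numerically as follows. This is a minimal illustration, assuming NumPy; the function name `simulate_measurements`, the seed, and the parameter values are hypothetical choices made for this example and do not come from the paper.

```python
import numpy as np

def simulate_measurements(K, n, h, sigma_x2, seed=0):
    """Draw n i.i.d. time samples of the K states and measurements in (1)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, np.sqrt(sigma_x2), size=(K, n))  # states ~ N(0, sigma_x2)
    Z = rng.normal(0.0, 1.0, size=(K, n))                # unit-variance noise
    # Mixing matrix: 1 on the diagonal, sqrt(h) off the diagonal.
    H = np.sqrt(h) * np.ones((K, K)) + (1.0 - np.sqrt(h)) * np.eye(K)
    Y = H @ X + Z
    return X, Y

# Illustrative parameters (assumptions, not values from the paper).
X, Y = simulate_measurements(K=4, n=200_000, h=0.5, sigma_x2=1.0)
empirical_variance = Y.var(axis=1).mean()
print(empirical_variance)
```

Each row of `Y` holds one agent's measurement sequence, so the empirical per-agent variance should be close to $\sigma_X^2(1 + h(K-1)) + 1$ for large $n$.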

Utility: For the continuous Gaussian distributed state and
measurements, a reasonable metric for utility at the $k$-th agent
is the mean square error $D_k$ between the original and the
estimated state sequences $X_k^n$ and $\hat{X}_k^n$, respectively.

Privacy: The measurements at each agent, in conjunction
with the quantized data shared by the other agents, enable
accurate estimation but also leak information about the
other agents' states. We capture this leakage using mutual
information.

B. Communication Protocol 

We assume that each agent broadcasts a function of its 
measurements (distributed protocol) to all agents and they do
so in a round-robin fashion. We assume that all agents encode 
in one of the following two ways: i) local encoding in which 
each agent quantizes only its measurements; or ii) progressive 
encoding in which each agent encodes and transmits taking 
into account both its measurements and prior communications 
from other agents. In both cases, the agents transmit at a rate 
that takes into account the correlated measurements and prior 
communications of other agents. 

To better understand the advantage of the above distributed 
protocol, we also consider the case where the agents broadcast
as if communicating with a virtual central operator, say CEO, 
henceforth referred to as the centralized protocol. This may 
be viewed as the case in which the computing power at the 
agents is limited and the CEO shares with each agent its 
received messages (which are then decoded at each agent). For 
either protocol, the encoding can be either local or progressive. 
Let $I_p \in \{0, 1\}$ and $I_{enc} \in \{0, 1\}$ be random variables
that denote the choice of protocols and encodings such that
$I_p = 1$ and $I_p = 0$ for the distributed and centralized protocol,
respectively, and $I_{enc} = 1$ and $I_{enc} = 0$ for the progressive
and local encoding, respectively.

Formally, the encoder at agent $k$ maps its measurements to
an index set $\mathcal{J}_k$ where

$$\mathcal{J}_k = \{1, 2, \ldots, J_k\}, \quad k = 1, 2, \ldots, K, \tag{2}$$

is the index set at the $k$-th agent for mapping the measurement
sequence, and the prior communications (progressive encoding),
via the encoder $f_k$, $k = 1, 2, \ldots, K$, defined as

$$f_k : \mathcal{Y}_k^n \times \Big( I_{enc} \cdot \prod_{l=1}^{k-1} \mathcal{J}_l \Big) \to \mathcal{J}_k, \tag{3}$$

such that at the end of the K broadcasts, one from each agent,
the decoding function $F_k$ at the $k$-th agent (or the CEO) is
a mapping from the received message sets (both protocols)
and the measurements (the distributed protocol) to the
reconstructed sequence $\hat{X}_k^n$, as specified in (4).



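Let $M_k$ denote the size of $\mathcal{J}_k$. The expected distortion $D_k$ at
the $k$-th agent is given by

$$D_k = \frac{1}{n} E\Big[ \sum_{i=1}^{n} \big( X_{k,i} - \hat{X}_{k,i} \big)^2 \Big], \quad k = 1, 2, \ldots, K. \tag{5}$$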



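The privacy leakage, $L_k^{(l)}$, about state $k$ at agent $l$, $l \neq k$, is
given by

$$L_k^{(l)} = \frac{1}{n} I\big( X_k^n; J_1, J_2, \ldots, J_K, Y_l^n \big), \quad \text{for all } k \neq l. \tag{6}$$

The communication rate of the $k$-th agent is denoted by

$$R_k = n^{-1} \log_2 M_k, \quad k = 1, 2, \ldots, K. \tag{7}$$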

Definition 1: The utility-privacy tradeoff region is the set
of all $(D_1, \ldots, D_K, L_1^{(2)}, \ldots, L_1^{(K)}, \ldots, L_K^{(1)}, \ldots, L_K^{(K-1)})$ for
which there exists a coding scheme given by (2)-(4) with
parameters $(n, K, M_1, M_2, D_1 + \epsilon, \ldots, D_K + \epsilon, L_1 + \epsilon, \ldots, L_K + \epsilon)$
for $n$ sufficiently large such that $\epsilon \to 0$ as $n \to \infty$.

III. Main Results 

We use the following proposition, lemma, and function 
definition in the sequel to compute the achievable distortions 
and rates. 

Proposition 1: For (column) vectors $A$ and $B$, let
$K_{AA} = \mathrm{var}(A) = E[(A - E[A])(A - E[A])^T]$ and
$K_{AB} = E[(A - E[A])(B - E[B])^T]$ denote the covariance and
cross-correlation matrices, respectively. The conditional
variance $E[\mathrm{var}(A|B)]$ is then given as
$E[\mathrm{var}(A|B)] = K_{AA} - K_{AB} K_{BB}^{-1} K_{AB}^T$.
Lemma 1: For a $K \times K$ symmetric Toeplitz matrix whose
diagonal entries are all $a$, and off-diagonal entries are all $b$, the
determinant is $(a + (K - 1) b)(a - b)^{K-1}$.

Proof: The determinant is obtained by the following two
operations: i) add columns 2 through K to column 1, and ii) subtract
row 1 from each of the remaining rows. ■
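Lemma 1 can be checked numerically on a small instance. The sketch below assumes NumPy; the helper names `equicorrelated` and `lemma1_det` and the test values of $K$, $a$, and $b$ are hypothetical choices for illustration only.

```python
import numpy as np

def equicorrelated(K, a, b):
    """K x K matrix with a on the diagonal and b everywhere else."""
    return (a - b) * np.eye(K) + b * np.ones((K, K))

def lemma1_det(K, a, b):
    """Closed-form determinant stated in Lemma 1."""
    return (a + (K - 1) * b) * (a - b) ** (K - 1)

# Illustrative values (assumptions, not taken from the paper).
K, a, b = 5, 3.0, 0.7
numeric = np.linalg.det(equicorrelated(K, a, b))
closed = lemma1_det(K, a, b)
print(numeric, closed)
```

The two printed values should agree up to floating-point error, since the all-ones direction has eigenvalue $a + (K-1)b$ and the remaining $K-1$ directions have eigenvalue $a - b$.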
Definition 2: For some $\alpha, \beta \in \mathbb{R}_+$, the function
$f_1(k, c) = \alpha + (k - 2)\beta - (k - 1)c$ varies over $k \in [1, K]$ and $c \in \mathbb{R}_+$.

A. Distortion 

We assume that each agent has the same distortion con- 
straint D. The distortion D at each agent ranges from a mini- 
mum achieved when it has perfect access to the measurements 
at all agents to a maximum achieved when it estimates using 
only its own measurements. From the symmetry of the model
in (1), the minimal (resp. maximal) distortion achieved at each
agent is the same. Let $D_{\min}$ and $D_{\max}$ denote the minimal
and maximal distortions, respectively, at each agent. For the
Gaussian model considered here with minimum mean square
error (MSE) constraints, we have

$$D_{\min} = E[\mathrm{var}(X_1 | Y_1 Y_2 \ldots Y_K)], \quad \text{and} \tag{8}$$
$$D_{\max} = E[\mathrm{var}(X_1 | Y_1)]. \tag{9}$$



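$$F_k : \mathcal{J}_1 \times \cdots \times \mathcal{J}_K \times \big( \mathcal{Y}_k^n \cdot I_p \big) \to \hat{X}_k^n, \quad k = 1, 2, \ldots, K. \tag{4}$$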



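$$\alpha \equiv E[Y_l^2] = \sigma_X^2 (1 + h(K - 1)) + 1, \quad \text{for all } l, \tag{10a}$$
$$\beta \equiv E[Y_l Y_k] = \sigma_X^2 \big( 2\sqrt{h} + h(K - 2) \big), \quad l \neq k. \tag{10b}$$

Note that for large K, $\alpha \approx h(K - 1)\sigma_X^2$ and $\beta \approx h(K - 2)\sigma_X^2$.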
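The second moments $\alpha$ and $\beta$ can be cross-checked against the covariance matrix of the measurement vector implied by the model. This is a minimal sketch, assuming NumPy; the function names and the parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def alpha_beta(K, h, sigma_x2):
    """Closed-form second moments of the measurements, as in (10a) and (10b)."""
    alpha = sigma_x2 * (1 + h * (K - 1)) + 1
    beta = sigma_x2 * (2 * np.sqrt(h) + h * (K - 2))
    return alpha, beta

def measurement_covariance(K, h, sigma_x2):
    """Covariance of [Y_1, ..., Y_K] built directly from the model."""
    # Mixing matrix: 1 on the diagonal, sqrt(h) off the diagonal.
    H = np.sqrt(h) * np.ones((K, K)) + (1 - np.sqrt(h)) * np.eye(K)
    return sigma_x2 * (H @ H.T) + np.eye(K)  # unit-variance noise adds I

# Illustrative parameters (assumptions, not values from the paper).
K, h, sigma_x2 = 6, 0.5, 2.0
C = measurement_covariance(K, h, sigma_x2)
alpha, beta = alpha_beta(K, h, sigma_x2)
print(C[0, 0], alpha, C[0, 1], beta)
```

Every diagonal entry of `C` should equal `alpha` and every off-diagonal entry should equal `beta`, which is the symmetric structure that Lemma 1 exploits.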

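Computation of $D_{\max}$: Expanding (9), we obtain

$$D_{\max} = \sigma_X^2 - \frac{\sigma_X^4}{\alpha}. \tag{11}$$

For large K, $E[\mathrm{var}(X_1 | Y_1)] \to \sigma_X^2$.

Computation of $D_{\min}$: Expanding (8), we have

$$D_{\min} = E[\mathrm{var}(X_1 | Y_1 Y_2 \ldots Y_K)] \tag{12}$$
$$= \frac{\big| E[\mathrm{var}(X_1 Y_2 \ldots Y_K | Y_1)] \big|}{\big| E[\mathrm{var}(Y_2 \ldots Y_K | Y_1)] \big|} \tag{13}$$

where the simplification in (13) results from the assumption
of jointly Gaussian random variables. Applying Lemma 1 for

$$c_1 = \sigma_X^2 - \sigma_X^4/\alpha, \quad c_2 = \sigma_X^2 \big( \sqrt{h} - \beta/\alpha \big), \tag{14}$$
$$c_3 = \alpha - \beta^2/\alpha, \quad \text{and} \quad c_4 = \beta - \beta^2/\alpha, \tag{15}$$

we obtain the minimum distortion $D_{\min}$ as

$$D_{\min} = c_1 - \frac{(K - 1)\, c_2^2}{c_3 + (K - 2)\, c_4}. \tag{16}$$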
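The distortion limits can be evaluated directly from the joint covariance of the state and the measurements. The sketch below assumes NumPy and compares the conditional variance given all measurements with the expression $c_1 - (K-1)c_2^2 / (c_3 + (K-2)c_4)$, which is one way of writing the determinant ratio in (13) via a Schur complement; the function names and parameter values are hypothetical.

```python
import numpy as np

def joint_cov(K, h, sigma_x2):
    """Covariance of [X_1, Y_1, ..., Y_K] under the measurement model (1)."""
    H = np.sqrt(h) * np.ones((K, K)) + (1 - np.sqrt(h)) * np.eye(K)
    C = np.zeros((K + 1, K + 1))
    C[0, 0] = sigma_x2
    C[0, 1:] = sigma_x2 * H[:, 0]              # E[X_1 Y_k]
    C[1:, 0] = C[0, 1:]
    C[1:, 1:] = sigma_x2 * (H @ H.T) + np.eye(K)
    return C

def cond_var_x1(C, obs):
    """Variance of X_1 given the components of C indexed by obs (Proposition 1)."""
    obs = list(obs)
    k_ab = C[0, obs]
    k_bb = C[np.ix_(obs, obs)]
    return C[0, 0] - k_ab @ np.linalg.solve(k_bb, k_ab)

def d_min_closed_form(K, h, sigma_x2):
    alpha = sigma_x2 * (1 + h * (K - 1)) + 1
    beta = sigma_x2 * (2 * np.sqrt(h) + h * (K - 2))
    c1 = sigma_x2 - sigma_x2**2 / alpha
    c2 = sigma_x2 * (np.sqrt(h) - beta / alpha)
    c3 = alpha - beta**2 / alpha
    c4 = beta - beta**2 / alpha
    return c1 - (K - 1) * c2**2 / (c3 + (K - 2) * c4)

# Illustrative parameters (assumptions, not values from the paper).
K, h, sigma_x2 = 5, 0.5, 1.0
C = joint_cov(K, h, sigma_x2)
d_max = cond_var_x1(C, [1])                   # only Y_1 observed
d_min = cond_var_x1(C, range(1, K + 1))       # all of Y_1, ..., Y_K observed
print(d_max, d_min, d_min_closed_form(K, h, sigma_x2))
```

Working from the full covariance matrix avoids any dependence on the algebraic simplification, so agreement between `d_min` and the closed form is a useful sanity check on the constants $c_1$ through $c_4$.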



Remark 1: For $K \to \infty$, $D_{\min} \to D_{\max}\big( 1 - (1 - \sqrt{h})^2 / h \big)$.



B. Distributed Protocol

A general coding strategy for this distributed source coding 
problem needs to take into account: a) the order of agent 
broadcasts; b) multiple encoding possibilities at each agent 
depending on whether the received data is used along with
local measurements in encoding; c) exploiting the correlated 
measurements at other agents in broadcasting just sufficient 
data for other agents to achieve their distortions; and d) mul- 
tiple rounds of interactions. We present a distributed encoding 
scheme with a single round of communication (for simplicity 
of analysis) in which the agents broadcast in order (the source 
permutation choice is irrelevant due to the symmetry of the 
model). The local and progressive coding schemes differ in 
including the received data in encoding at each agent, while 
the centralized and distributed protocols differ in whether they 
exploit the correlated measurements at the other agents. 

The achievable distortion $D$ in general depends on the
encoding scheme chosen. Let $R_k$ and $\tilde{R}_k$ denote the rates
for the local and progressive encoding schemes, respectively.
We first consider the progressive encoding scheme in which
each agent broadcasts (to all other agents) a noisy function
of both its measurements and prior communications.
More precisely, agent $k$ maps its measurement and prior
communication sequences to one among a set of $2^{n\tilde{R}_k}$ $\tilde{U}_k^n$
sequences chosen to satisfy the distortion constraints. The
$\tilde{U}_k^n$ sequences are generated via an i.i.d. distribution of $\tilde{U}_{k,i}$
for all $i$ such that $\tilde{U}_{1,i} = Y_{1,i} + Q_{1,i}$ and, for all $k > 1$,
$\tilde{U}_{k,i} = Y_{k,i} + \sum_{l=1}^{k-1} a_{k,l} \tilde{U}_{l,i} + Q_{k,i}$, where $a_{k,l} \in \mathbb{R}$, and
$Q_{k,i} \sim \mathcal{N}(0, \sigma_Q^2)$ is independent of $Y_{k,i}$ for all $k = 1, 2, \ldots, K$, and
$i = 1, 2, \ldots, n$.

The achievable distortion $D$ at agent $k$ as a result of
estimating its state using both its measurements $Y_k^n$ and the
received sequences $\tilde{U}_l^n$, for all $l \neq k$, is such that
$D \in [D_{\min}, D_{\max}]$, where $D_{\max}$ is achieved when $\tilde{U}_l^n = 0$ for all
$l$ and $D = D_{\min}$ for $\sigma_Q^2 = 0$. On the other hand, for the
local encoding scheme, let $U_{k,i} = Y_{k,i} + Q_{k,i}$, for all $k$ and
$i$, such that agent $k$ maps only its measurement sequences to
one among a set of $2^{nR_k}$ $U_k^n$ sequences chosen to satisfy the
distortion constraints.

Theorem 1: The sets $\mathcal{D}$ of all achievable distortions $D$ for
the local and progressive encoding schemes for the distributed
protocol are the same.

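Proof: For Gaussian codebooks and Gaussian measurements,
and from the symmetry of the model, the distortion $D$ at
each agent is given by

$$D = E\big[ \mathrm{var}(X_1 | Y_1 \tilde{U}_1 \tilde{U}_2 \tilde{U}_3 \ldots \tilde{U}_K) \big] \tag{17}$$
$$= E\big[ \mathrm{var}(X_1 | Y_1 U_1 U_2 U_3 \ldots U_K) \big] \in [D_{\min}, D_{\max}] \tag{18}$$

where in (18) we have used the fact that $\tilde{U}_1 = U_1$ and that,
conditioned on $U_1$, it suffices to condition on $U_2$ (in place of
$\tilde{U}_2$), and similarly for the remaining $U_k$, $k > 2$. ■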
Computation of $D$: Using the independence of the quantization
noise $Q_k$ for all $k$, as well as the independence of $Q_k$
and $X_k$, we have $E[U_k U_l] = E[Y_k Y_l] = \beta$ for all $l \neq k$ and
$E[U_k^2] = E[Y_k^2] + E[Q_k^2] = \alpha + \sigma_Q^2$. Thus, $D$ is obtained
in a manner analogous to the calculation of $D_{\min}$ with the
replacement of $c_3$ by $c_3 + \sigma_Q^2$. Thus, we have

$$D = c_1 - \frac{(K - 1)\, c_2^2}{c_3 + \sigma_Q^2 + (K - 2)\, c_4}. \tag{19}$$
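The dependence of the distortion on the quantization noise variance can be sketched as follows. This is a minimal illustration, assuming NumPy; it evaluates the conditional variance of $X_1$ given $Y_1$ and the noisy broadcasts $U_k = Y_k + Q_k$, $k \geq 2$, both from the covariance matrix and from the expression obtained by replacing $c_3$ with $c_3 + \sigma_Q^2$. Function names and parameter values are hypothetical.

```python
import numpy as np

def alpha_beta(K, h, sigma_x2):
    alpha = sigma_x2 * (1 + h * (K - 1)) + 1
    beta = sigma_x2 * (2 * np.sqrt(h) + h * (K - 2))
    return alpha, beta

def distortion_closed_form(K, h, sigma_x2, sigma_q2):
    """D_min-style expression with c3 replaced by c3 + sigma_q2."""
    alpha, beta = alpha_beta(K, h, sigma_x2)
    c1 = sigma_x2 - sigma_x2**2 / alpha
    c2 = sigma_x2 * (np.sqrt(h) - beta / alpha)
    c3 = alpha - beta**2 / alpha
    c4 = beta - beta**2 / alpha
    return c1 - (K - 1) * c2**2 / (c3 + sigma_q2 + (K - 2) * c4)

def distortion_direct(K, h, sigma_x2, sigma_q2):
    """E[var(X_1 | Y_1, U_2, ..., U_K)] from the joint covariance."""
    alpha, beta = alpha_beta(K, h, sigma_x2)
    # Covariance of the observations [Y_1, U_2, ..., U_K].
    S = beta * np.ones((K, K)) + (alpha - beta) * np.eye(K)
    S[1:, 1:] += sigma_q2 * np.eye(K - 1)     # quantization noise on U_2..U_K
    # Cross-covariance with X_1: sigma_x2 for Y_1, sqrt(h)*sigma_x2 otherwise.
    r = sigma_x2 * np.sqrt(h) * np.ones(K)
    r[0] = sigma_x2
    return sigma_x2 - r @ np.linalg.solve(S, r)

# Illustrative parameters (assumptions, not values from the paper).
K, h, sigma_x2 = 6, 0.5, 1.0
for sigma_q2 in (0.0, 0.5, 10.0):
    print(sigma_q2,
          distortion_closed_form(K, h, sigma_x2, sigma_q2),
          distortion_direct(K, h, sigma_x2, sigma_q2))
```

Agent 1's own broadcast $U_1$ is omitted from the observations because it is $Y_1$ plus independent noise and so adds nothing once $Y_1$ is known. The distortion grows from $D_{\min}$ at $\sigma_Q^2 = 0$ toward $D_{\max}$ as $\sigma_Q^2$ becomes large.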



Rate Computation: We consider a round-robin protocol in
which agent 1 broadcasts a quantized function of its measurements
and prior communications at a rate which takes into
account all the side information at all other agents. Thus, the
rate $\tilde{R}_1$ required is the maximum of the rates required for each
agent and is given by

$$\tilde{R}_1 \geq I(\tilde{U}_1; Y_1) - \min\big( I(\tilde{U}_1; Y_2), \ldots, I(\tilde{U}_1; Y_K) \big) \tag{20a}$$
$$= I(U_1; Y_1) - I(U_1; Y_2) = R_1 \tag{20b}$$

where (20b) follows from the symmetry of the measurement
model and the fact that $\tilde{U}_1 = U_1$, and $R_1$ is the minimal rate
required at agent 1 for the local scheme. Next, agent 2
analogously broadcasts a function of its measurements at a
rate $\tilde{R}_2$ given by

$$\tilde{R}_2 \geq I(\tilde{U}_2; Y_2 U_1) - \min_{l \in \{1, \ldots, K\},\, l \neq 2} I(\tilde{U}_2; Y_l U_1) \tag{21a}$$
$$= I(\tilde{U}_2; Y_2 | U_1) - \min_{l \in \{1, \ldots, K\},\, l \neq 2} I(\tilde{U}_2; Y_l | U_1) \tag{21b}$$
$$= I(U_2; Y_2) - I(U_2; Y_1) = R_2 \tag{21c}$$

where (21c) follows from $h(\tilde{U}_2 | Y_1 U_1) - h(\tilde{U}_2 | Y_2 U_1) =
h(U_2 | Y_1) - h(U_2 | Y_2)$ since $U_2 - Y_2 - U_1$ form a Markov
chain and due to the symmetry of the model. It can be verified
easily that the bound in (21c) is the minimal rate $R_2$ for the
local encoding scheme. One can similarly show that the rate
at which agent 3 broadcasts is

$$\tilde{R}_3 \geq I(\tilde{U}_3; Y_3 U_1 U_2) - \min_{l \in \{1, \ldots, K\},\, l \neq 3} I(\tilde{U}_3; Y_l U_1 U_2) \tag{22a}$$
$$= I(U_3; Y_3) - I(U_3; Y_1 U_2) = R_3 \tag{22b}$$

where we have used the fact that $U_3 - Y_3 - U_1 U_2$ and
$U_1 - Y_1 - U_3$ form Markov chains. Generalizing, we have, for all
$k > 1$,

$$\tilde{R}_k = R_k \geq I(U_k; Y_k) - I(U_k; Y_1 U_1 \ldots U_{k-1}), \tag{23a}$$

where the bound in (23a) is the minimal rate at which agent
$k$ is required to broadcast when it only encodes $Y_k^n$.

Calculation of Leakage: For the proposed progressive encoding,
the leakage of the state of agent $k$ at any other agent
$j \neq k$, for all such $k, j$, is bounded as

$$L_k^{(j)} = \frac{1}{n} I\big( X_k^n; Y_j^n J_1 J_2 \ldots J_K \big), \quad j \neq k \tag{24a}$$
$$\geq I(X_1; Y_2 \tilde{U}_1 \ldots \tilde{U}_K) = I(X_1; Y_2 U_1 \ldots U_K) \tag{24b}$$
$$= \frac{1}{2} \log\big( \cdots \big) \tag{24c}$$

where (24b) is a result of the model symmetry, the code
construction, and typicality arguments; the proof is omitted for
brevity. The bound in (24c) follows from the relation of
the code constructions for the two encoding schemes and
$c \equiv (\beta - \sqrt{h}\sigma_X^2)^2 / (\alpha - \sigma_X^2) + h\sigma_X^2$.

Theorem 2: It is sufficient to encode the local measure- 
ments at each agent in the distributed protocol. 

Theorem 2 follows directly from the fact that for Gaussian
encoding, from (18), (23a), and (24c), we have that the set
of all rate-distortion-leakage tuples achieved by the local and
progressive encoding schemes is the same.

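The sum-rate of the distributed scheme $R_{sum} = \sum_{k=1}^{K} R_k$
can be simplified as

$$R_{sum} = h(U_2 U_3 \ldots U_K | Y_1) + h(U_1 | Y_2) - \frac{K}{2} \log\big( 2\pi e \sigma_Q^2 \big) \tag{25a}$$
$$= \frac{K - 1}{2} \log\Big( \frac{\alpha + \sigma_Q^2 - \beta}{\sigma_Q^2} \Big) + \frac{1}{2} \log\Big( \frac{\alpha + \sigma_Q^2 - \beta^2/\alpha}{\sigma_Q^2} \Big) + \frac{1}{2} \log\Big( \frac{f_1(K, \beta^2/\alpha) + \sigma_Q^2}{\alpha + \sigma_Q^2 - \beta} \Big) \tag{25b}$$

where (25b) is obtained from (25a) by determining
$\big| E[\mathrm{var}(\underline{U}_{K-1} | Y_1)] \big|$ where $\underline{U}_{K-1} = (U_2 U_3 \ldots U_K)^T$
denotes a column vector of length $(K - 1)$. By expanding
$E[\mathrm{var}(\underline{U}_{K-1} | Y_1)]$ using Proposition 1, one can verify that
$\big| E[\mathrm{var}(\underline{U}_{K-1} | Y_1)] \big|$ simplifies to finding the determinant of the
$(K - 1) \times (K - 1)$ Toeplitz matrix with diagonal and off-diagonal
entries $\alpha + \sigma_Q^2 - \beta^2/\alpha$ and $\beta - \beta^2/\alpha$, respectively, which
from Lemma 1 is given by $\big( f_1(K, \beta^2/\alpha) + \sigma_Q^2 \big)(\alpha + \sigma_Q^2 - \beta)^{K-2}$.
One can similarly show that $E[\mathrm{var}(U_1 | Y_2)] = \alpha + \sigma_Q^2 - \beta^2/\alpha$.
In the limit of $K \to \infty$, $(K - 2)\beta - (K - 1)\beta^2/\alpha \to 0$,
$\alpha - \beta^2/\alpha \propto h$, $\alpha - \beta \propto h$, and therefore, the second and
third log terms in (25b) grow at most as $\log(K)$. Thus, in the limit,
the per-agent rate $R = R_{sum}/K$ is given by

$$\lim_{K \to \infty} R = \frac{1}{2} \log\Big( \frac{\alpha - \beta + \sigma_Q^2}{\sigma_Q^2} \Big). \tag{26}$$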
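The distributed sum-rate can be evaluated two ways as a consistency check: from the conditional Gaussian entropies, and from the three-term logarithmic form. The sketch below assumes NumPy and rates in bits (base-2 logarithms, an assumption of this example); function names and parameter values are hypothetical.

```python
import numpy as np

def alpha_beta(K, h, sigma_x2):
    alpha = sigma_x2 * (1 + h * (K - 1)) + 1
    beta = sigma_x2 * (2 * np.sqrt(h) + h * (K - 2))
    return alpha, beta

def gaussian_entropy_bits(cov):
    """Differential entropy (bits) of a Gaussian vector with covariance cov."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log2(2 * np.pi * np.e) + logdet / np.log(2))

def sum_rate_from_entropies(K, h, sigma_x2, sigma_q2):
    """h(U_2..U_K | Y_1) + h(U_1 | Y_2) - (K/2) log(2 pi e sigma_q2)."""
    alpha, beta = alpha_beta(K, h, sigma_x2)
    s_uu = beta * np.ones((K - 1, K - 1)) + (alpha + sigma_q2 - beta) * np.eye(K - 1)
    s_uy = beta * np.ones((K - 1, 1))
    cond = s_uu - s_uy @ s_uy.T / alpha        # Proposition 1 with B = Y_1
    h_rest = gaussian_entropy_bits(cond)
    h_first = gaussian_entropy_bits(alpha + sigma_q2 - beta**2 / alpha)
    return h_rest + h_first - 0.5 * K * np.log2(2 * np.pi * np.e * sigma_q2)

def sum_rate_closed_form(K, h, sigma_x2, sigma_q2):
    """Three-term logarithmic form of the same quantity."""
    alpha, beta = alpha_beta(K, h, sigma_x2)
    f1 = alpha + (K - 2) * beta - (K - 1) * beta**2 / alpha
    t1 = 0.5 * (K - 1) * np.log2((alpha + sigma_q2 - beta) / sigma_q2)
    t2 = 0.5 * np.log2((alpha + sigma_q2 - beta**2 / alpha) / sigma_q2)
    t3 = 0.5 * np.log2((f1 + sigma_q2) / (alpha + sigma_q2 - beta))
    return t1 + t2 + t3

# Illustrative parameters (assumptions, not values from the paper).
K, h, sigma_x2, sigma_q2 = 5, 0.5, 1.0, 1.0
print(sum_rate_from_entropies(K, h, sigma_x2, sigma_q2),
      sum_rate_closed_form(K, h, sigma_x2, sigma_q2))
```

Only the first term of the closed form carries a factor that grows linearly in $K$, which is why it alone survives in the per-agent limit.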



C. Distributed vs. Centralized 

We now compare the distributed protocol to a centralized 
protocol in which each agent broadcasts at a rate intended 
for a (virtual) CEO, and thus, is oblivious of the correlated 
measurements at the other agents. Here again, the agents 
can use a progressive encoding scheme analogously to the 
distributed protocol. As in the distributed protocol, here too 
one can show that a local encoding scheme suffices, in which 
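agent $k$ generates a codebook $U_k^n$ whose entries $U_{k,i}$ are
generated in an i.i.d. fashion such that $U_{k,i} = Y_{k,i} + Q_{k,i}$,
where $Q_{k,i}$ is independent of $Y_{k,i}$ and $Q_{l,i}$, for all $l \neq k$, for all $k$,
and for all $i$. The compression rates are bounded as follows.
First, agent 1 transmits its quantized measurements at a rate
$R_1$ such that for error-free decoding of $U_1^n$ at the decoder, we
require

$$R_1 \geq I(U_1; Y_1). \tag{27}$$

Agent 2 takes into account the knowledge of $U_1^n$ at all agents
and broadcasts at a rate

$$R_2 \geq I(U_2; Y_2) - I(U_2; U_1). \tag{28}$$

Note that the agents broadcast taking into account the prior
transmissions (as if to a CEO) but not the side information at
the other agents. Continuing similarly, we have for all $k > 2$,

$$R_k \geq I(U_k; Y_k) - I(U_k; U_1 U_2 \ldots U_{k-1}). \tag{29}$$

The resulting sum rate $R_{sum}^{CEO} = \sum_{k=1}^{K} R_k$ can be simplified
as

$$R_{sum}^{CEO} = h(U_K, U_{K-1}, \ldots, U_1) - \frac{K}{2} \log\big( 2\pi e \sigma_Q^2 \big) \tag{30}$$
$$= \frac{1}{2} \log\Big( \frac{(\alpha + \sigma_Q^2 + (K - 1)\beta)(\alpha + \sigma_Q^2 - \beta)^{K-1}}{\sigma_Q^{2K}} \Big) \tag{31}$$
$$= \frac{K}{2} \log\Big( \frac{\alpha + \sigma_Q^2 - \beta}{\sigma_Q^2} \Big) + \frac{1}{2} \log\Big( \frac{\alpha + \sigma_Q^2 + (K - 1)\beta}{\alpha + \sigma_Q^2 - \beta} \Big). \tag{32}$$

Thus, the rate on average per user is $R^{CEO} = R_{sum}^{CEO}/K$,
which converges in the limit of a large number of agents K
to

$$\lim_{K \to \infty} R^{CEO} = \frac{1}{2} \log\Big( \frac{\alpha - \beta + \sigma_Q^2}{\sigma_Q^2} \Big). \tag{33}$$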



Comparing (25b) and (32), we can verify that for every
choice of $\sigma_Q^2$, and hence $D$, $R_{sum}^{CEO} > R_{sum}$. Furthermore, one
can also show that the leakage at each agent for the centralized
protocol is the same as for the distributed protocol in (24) and is
the same for both the local and progressive encoding schemes.
The following theorem summarizes our results.

Theorem 3: The average per user rate of the centralized 
protocol is strictly lower bounded by that for the distributed 
protocol and converges to this lower bound only in the limit 
of large K. 
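The comparison between the two protocols can be sketched numerically. The example below assumes NumPy and base-2 logarithms, and uses the sum-rate expressions that follow from Lemma 1 for the symmetric Gaussian model: the centralized sum-rate is the joint entropy of all $U_k$ minus the quantization-noise entropy, and the distributed one conditions on a neighbor's measurement instead. The function name `per_user_rates` and the parameter values are hypothetical.

```python
import numpy as np

def per_user_rates(K, h, sigma_x2, sigma_q2):
    """Return (distributed, centralized, limit) per-user rates in bits."""
    alpha = sigma_x2 * (1 + h * (K - 1)) + 1
    beta = sigma_x2 * (2 * np.sqrt(h) + h * (K - 2))
    a_minus_b = alpha + sigma_q2 - beta
    f1 = alpha + (K - 2) * beta - (K - 1) * beta**2 / alpha

    # Distributed protocol: side information at the other agents is exploited.
    dist_sum = (0.5 * (K - 1) * np.log2(a_minus_b / sigma_q2)
                + 0.5 * np.log2((alpha + sigma_q2 - beta**2 / alpha) / sigma_q2)
                + 0.5 * np.log2((f1 + sigma_q2) / a_minus_b))

    # Centralized protocol: only prior broadcasts are exploited.
    cen_sum = (0.5 * K * np.log2(a_minus_b / sigma_q2)
               + 0.5 * np.log2((alpha + sigma_q2 + (K - 1) * beta) / a_minus_b))

    limit = 0.5 * np.log2(a_minus_b / sigma_q2)   # common large-K limit
    return dist_sum / K, cen_sum / K, limit

# Illustrative parameters (assumptions, not values from the paper).
for K in (2, 10, 100, 1000):
    dist, cen, limit = per_user_rates(K, h=0.5, sigma_x2=1.0, sigma_q2=1.0)
    print(K, round(dist, 4), round(cen, 4), round(limit, 4))
```

For this symmetric model $\alpha - \beta = \sigma_X^2 (1 - \sqrt{h})^2 + 1$ does not depend on $K$, so the common limit is a constant and the gap between the two per-user rates shrinks as the number of agents grows.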



[Figure 1: curves "R Distributed" and "R Centralized" plotted against the number of agents K, for h = 0.5.]

Fig. 1. Plot of per-user rate $R$ and leakage $L_k^{(l)}$ of any agent $k$ vs. K.



D. Outer Bounds 



From the symmetry of the model, it suffices to bound the
rate $R_1$ of agent 1 as

$$R_1 \geq \cdots \tag{34}$$
$$\geq h(Y_1 | Y_2 \ldots Y_K) - \frac{1}{n} \sum_{i=1}^{n} h\big( Y_{1,i} | \hat{X}_{2,i} Y_{2,i} \ldots Y_{K,i} \big) \tag{35}$$
$$\geq h(Y_1 | Y_2 \ldots Y_K) - \frac{1}{2} \log( 2\pi e \Sigma ) \tag{36}$$

where (35) results from the fact that $\hat{X}_2^n, \ldots, \hat{X}_K^n$ can be
estimated from $J_1, Y_2^n, \ldots, Y_K^n$, and that conditioning on only one
of the estimates yields a lower bound on $R_1$, and (36) results from
using the fact that a jointly Gaussian distribution maximizes
the differential entropy for a fixed variance, and from the concavity
of the log function, for $\Sigma = E\big[ \mathrm{var}(Y_1 | \hat{X}_2 Y_2 Y_3 \ldots Y_K) \big]$.
For jointly Gaussian $(Y_1, \ldots, Y_K, \hat{X}_2)$, we can write

$$\hat{X}_2 = Y_2 + \sum_{l=1,\, l \neq 2}^{K} b\, Y_l + Z \tag{37}$$

where $Z \sim \mathcal{N}(0, \sigma_Z^2)$ is independent of $Y_k$ for all $k$, and
from symmetry, we choose the same scaling constant $b$ in (37).

, 2 



For g = 



E[ X2-Y2- bYs 



bY, 



K 



ci — 0^9, and C2 = ci + (/3 — Pag) / (a — o?g) , we obtain 



^1 > 2 log 



log 



.fl{K,C2) 



(38) 
(39) 



where we have used the orthogonality of the minimum MSE
estimate and the measurements, i.e., $E\big[ (X_1 - \hat{X}_1) Y_l \big] = 0$
for all $l \neq 1$, and the distortion constraint in (5).

With $\hat{X}_2$ in (37), one can similarly bound $L_1^{(2)} = L_k^{(j)}$ (from
symmetry), for all $j$, as

$$L_1^{(2)} \geq \frac{1}{n} I\big( X_1^n; Y_2^n J_1 J_2 \ldots J_K \big)$$

>h{Xi)-^ log (2TreE var lYa^i 
= ^ log Ui / ( (1 ^ ^xll) 91 - - 92 



(40) 
(41) 



where gi 



E 



(if 2) 6/3/2 + 4-1)-!" 



X2 — Y2 

2- 
z 



= {b^{K-l)a + 



gi = a-gi6^r (is:-l)7 and 

92 = 9ib^ (l + {K- 2) Vh) 13 [K ~ 1) 



(42) 
(43) 



Remark 2: Due to the lack of a pre-log factor K, the per-user
rate $R$ for the outer bound rapidly approaches 0 with K
(relative to the inner bounds).

The rate $R$ and leakage $L_k$ (for any $k$) as a function of K
are illustrated in Fig. 1 for $h = 0.5$ and $\sigma_Q = 0.$

IV. Concluding Remarks 

We have introduced a distributed state estimation problem 
among K agents with fidelity and privacy constraints. We have 
shown that the sum-rate and per-user rate achieved by a
distributed protocol, in which the agents directly interact taking
into account the prior knowledge at all agents, lower bound
those achieved by a centralized protocol, with convergence
for very large K. Tighter outer bounds that account for the
distributed coding are much needed.